Abstract

It is commonly believed that the correlations between stock returns increase
in high volatility periods. We investigate how much of these correlations can
be explained within a simple non-Gaussian one-factor description with
time-independent correlations. Using surrogate data with the true market
return as the dominant factor, we show that most of these correlations,
measured by a variety of different indicators, can be accounted for. In
particular, this one-factor model can explain the level and asymmetry of
empirical exceedance correlations. However, more subtle effects require an
extension of the one-factor model, where the variance and skewness of the
residuals also depend on the market return.

1 Introduction 

Understanding the relationship between the statistics of individual stock
returns and that of the corresponding index is a major issue in several finance
problems such as risk management [p]] or market micro-structure modeling. It is
also crucial for building optimized portfolios containing both index and stock
derivatives H, [|. Although the index return is the (weighted) sum of stock
returns, it actually displays very different statistical properties from what
would result if the stock returns were independent. In particular, the
cumulants (that is, the volatility, the skewness and the kurtosis) of the index
distribution, which should be suppressed by a power of the number of stocks N
for independent returns, are still very large, even for N = 500. The negative
skewness of the index, in particular, is actually larger than for individual
stocks, and reflects a specific leverage effect Q.
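
The suppression mentioned above can be made explicit with the standard scaling
of cumulants for an average of independent variables. The following is only a
sketch under simplifying assumptions that are not part of the original
analysis: the N returns are taken to be independent and identically
distributed, with variance $\sigma^2$, skewness $\varsigma$ and excess kurtosis
$\kappa$ (symbols introduced here for illustration).

```latex
r_m = \frac{1}{N}\sum_{i=1}^{N} r_i
\quad\Longrightarrow\quad
\sigma_m^2 = \frac{\sigma^2}{N}, \qquad
\varsigma_m = \frac{\varsigma}{\sqrt{N}}, \qquad
\kappa_m = \frac{\kappa}{N}.
```

Under these assumptions, N = 500 would reduce the skewness by a factor of about
22 and the excess kurtosis by a factor of 500, which is why large index
cumulants point to strong dependence between stocks.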

It is a common belief that cross-correlations between stocks actually fluctuate 
in time, and increase substantially in a period of high market volatility. This has 
been discussed in many papers - see for example ||, with more recent discussions, 
including new indicators, in @, 0, §, [|J. Furthermore, this increase is thought to be 
larger for large downward moves than for large upward moves. The dynamics of 
these correlations themselves, and their asymmetry, should be estimated, leading 
to rather complex models [7], [10], [11], [12]. The view of 'moving' correlations has a
direct consequence for risk management: the risk for a given portfolio is seen as 
resulting from both volatility fluctuations and correlation fluctuations. 

An alternative point of view is provided by factor models with a fixed
correlation structure. The simplest version contains a unique factor: the
market itself. In this case, the time fluctuations of the measured
cross-correlations between stocks are, as we show below, directly related to
the fluctuations of the market volatility. The notion of "correlation" risk
therefore reduces to market volatility risk, which considerably simplifies the
problem. In this paper, we address to what extent a non-Gaussian one-factor
model is able to capture the essential features of stock cross-correlations,
in particular in extreme market conditions. Our conclusion is that most of the
extreme risk correlations, measured by different indicators, are actually
captured by this simple fixed-correlation model. This model is able to
reproduce quantitatively the observed exceedance correlations || without
invoking the idea of 'regime switching' recently advocated in this context
in EE-

However, a more detailed analysis shows that a refined model is needed to ac- 
count for the dependence of the conditional volatility and skewness of the residuals 
on the market return. 



2 A non-Gaussian one-factor model 

We want to compare empirical measures of correlation with the prediction of a
fixed-correlation model. However, for generic non-Gaussian probability
distributions of returns, there is no unique way of building a multivariate
process. A natural choice is to assume that the return of every stock is the
sum of random independent (non-Gaussian) factors. While a multivariate Gaussian
process can always be decomposed into independent factors, this is not true for
generic non-Gaussian distributions. The existence of such a decomposition is
thus part of the definition of our model.

The model: We will call market the dominant factor in this decomposition
and write:

$$r_i(t) = \beta_i r_m(t) + \epsilon_i(t). \qquad (1)$$

The daily return is defined as $r_i(t) = S_i(t)/S_i(t-1) - 1$, where $S_i(t)$
is the value of stock $i$ on day $t$. The return is thus decomposed into a
market part $r_m(t)$ and a residual part $\epsilon_i(t)$. In a generic factor
model, the residuals $\epsilon_i(t)$ are combinations of all the factors except
the market and are therefore independent of it. The one-factor model
corresponds to the simple case where the $\epsilon_i(t)$ are also independent
of one another.

The market is defined as a weighted sum of the returns of all stocks. The
weights can be those of a market index such as the S&P 500. These could also be
the components of the eigenvector with the largest eigenvalue of the stock
cross-correlation matrix [13]. We have chosen to work simply with uniform
weights, leading to the following definition:

$$r_m(t) = \frac{1}{N}\sum_{i=1}^{N} r_i(t). \qquad (2)$$

Had we chosen another weighting scheme for the definition of the market, the
theoretical results below would still hold exactly provided that we replace
averages over all stocks by the corresponding weighted averages. On our data
set, the different weighted averages give essentially the same results. The
coefficients $\beta_i$ are then given by:

$$\beta_i = \frac{\langle r_i r_m\rangle - \langle r_i\rangle\langle r_m\rangle}{\langle r_m^2\rangle - \langle r_m\rangle^2}, \qquad (3)$$

where the brackets refer to time averages. This model is meaningful in the case
where $\beta_i$ is constant or slowly varying in time. Eq. (2) immediately
implies $(1/N)\sum_{i=1}^{N}\beta_i = 1$.
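
The two definitions above, the uniformly weighted market return and the
regression coefficient of each stock on it, can be sketched in a few lines of
Python. This is an illustration only: the function names and the toy returns
are hypothetical and are not taken from the data set used in the paper.

```python
def mean(xs):
    """Arithmetic mean of a sequence of numbers."""
    return sum(xs) / len(xs)


def market_return(returns):
    """Uniformly weighted market return: r_m(t) = (1/N) * sum_i r_i(t).

    `returns[i][t]` is the return of stock i on day t.
    """
    n_stocks = len(returns)
    n_days = len(returns[0])
    return [sum(r[t] for r in returns) / n_stocks for t in range(n_days)]


def beta(stock, market):
    """Time-averaged covariance of stock and market over market variance."""
    m_s, m_m = mean(stock), mean(market)
    cov = mean([s * m for s, m in zip(stock, market)]) - m_s * m_m
    var = mean([m * m for m in market]) - m_m ** 2
    return cov / var


# Toy data (hypothetical numbers): 3 stocks observed over 5 days.
returns = [
    [0.010, -0.020, 0.005, 0.015, -0.010],
    [0.020, -0.010, 0.000, 0.010, -0.030],
    [0.000, -0.030, 0.010, 0.020, 0.005],
]
r_m = market_return(returns)
betas = [beta(r, r_m) for r in returns]
mean_beta = mean(betas)  # equal to 1 up to rounding, for uniform weights
```

With uniform weights the average of the coefficients is 1 by construction,
whatever the data, because the average covariance of the stocks with the market
is the variance of the market itself.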

An important qualitative assumption of this model is that although the mar- 
ket is built from the fluctuations of the stocks, it is a more fundamental quantity 
than the stocks themselves. Hence, one cannot expect to explain the statistical 
properties of the market from those of the stocks within this model. 

Real data and surrogate data: The data set we considered is composed of
the daily returns of 450 U.S. equities among the most liquid ones from 1993 up
to 1999. In order to test the validity of a one-factor model, we also generated
surrogate data compatible with this model. Very importantly, the one-factor
model we consider is not based on Gaussian distributions, but rather on
fat-tailed distributions that match the empirical observations for both the
market and the stocks' daily returns [14].



The procedure we used to generate the surrogate data is the following:

• Compute the $\beta_i$'s using Eq. (3) over the whole time period [15]. These
$\beta_i$'s are found to be rather narrowly distributed around 1, with
$(1/N)\sum_{i=1}^{N}\beta_i^2 = 1.05$.

• Compute the variance of the residuals
$\sigma_{\epsilon_i}^2 = \sigma_i^2 - \beta_i^2\sigma_m^2$. On the dataset we
used, $\sigma_m$ was 0.91% (per day) whereas the rms $\sigma_{\epsilon_i}$ was
1.66%.

• Generate the residual $\epsilon_i(t) = \sigma_{\epsilon_i} u_i(t)$, where the
$u_i(t)$ are independent random variables of unit variance with a leptokurtic
(fat-tailed) distribution; we have chosen here a Student distribution with an
exponent $\mu = 4$:

$$P(u) = \frac{3}{(2+u^2)^{5/2}}, \qquad (4)$$

which is known to represent adequately the empirical data [14].

• Compute the surrogate return as
$r_i^{\rm surr}(t) = \beta_i r_m(t) + \epsilon_i(t)$, where $r_m(t)$ is
the true market return at day $t$.

Therefore, within this method, both the empirical and surrogate returns are based 
on the very same realization of the market statistics. This allows us to compare 
meaningfully the results of the surrogate model with real data, without further 
averaging. It also short-cuts the precise parameterization of the distribution of 
market returns, in particular its correct negative skewness, which turns out to be 
crucial. 
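
The surrogate construction described above can be sketched as follows. This is
a minimal illustration, not the authors' code: the function names, the seed and
the toy inputs are hypothetical. The residual is drawn by dividing a standard
normal variable by the square root of a scaled chi-squared variable with 4
degrees of freedom, a standard way to sample a Student variable with tail
exponent 4, and is then rescaled to unit variance.

```python
import math
import random


def student4_unit(rng):
    """Draw a unit-variance Student variable with tail exponent mu = 4.

    Z / sqrt(C / 4), with Z standard normal and C chi-squared with 4 degrees
    of freedom, is Student-t with 4 degrees of freedom. Its variance is 2,
    hence the final division by sqrt(2).
    """
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(4))
    return z / math.sqrt(chi2 / 4.0) / math.sqrt(2.0)


def surrogate_returns(market, betas, sigma_eps, seed=0):
    """Surrogate return beta_i * r_m(t) + sigma_eps_i * u_i(t) for each stock.

    `market` is the observed market return series; the result is indexed as
    surrogate[i][t].
    """
    rng = random.Random(seed)
    return [
        [b * r_m + s * student4_unit(rng) for r_m in market]
        for b, s in zip(betas, sigma_eps)
    ]


# Hypothetical inputs, in percent per day.
market = [0.4, -1.2, 0.9, -0.3, 2.1]
betas = [0.9, 1.0, 1.1]
sigma_eps = [1.66, 1.66, 1.66]
surrogate = surrogate_returns(market, betas, sigma_eps)
```

Because every surrogate stock is driven by the same observed market series,
any statistic computed on `surrogate` can be compared with its empirical
counterpart day by day.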



3 Conditioning on large returns 

Conditioning on absolute market return: We have first studied a measure
of correlations between stocks conditioned on an extreme market return. It is
indeed commonly believed that cross-correlations between stocks increase in such
"high-volatility" periods. A natural measure is given by the following
coefficient:

$$\rho_>(\lambda) = \frac{\frac{1}{N^2}\sum_{i,j}\left(\langle r_i r_j\rangle_{>\lambda} - \langle r_i\rangle_{>\lambda}\langle r_j\rangle_{>\lambda}\right)}{\frac{1}{N}\sum_{i}\left(\langle r_i^2\rangle_{>\lambda} - \langle r_i\rangle_{>\lambda}^2\right)}, \qquad (5)$$

where the subscript $>\lambda$ indicates that the averaging is restricted to
market returns $r_m$ in absolute value larger than $\lambda$. For $\lambda = 0$
the conditioning disappears. Note that the quantity $\rho_>$ is the average
covariance divided by the average variance, and therefore differs from the
average correlation coefficient. We have also studied the latter quantity, and
the following conclusions remain valid in that case.

In a first approximation, the distribution of individual stock returns can be
taken to be symmetrical, leading to $\langle r_i\rangle_{>\lambda} \simeq 0$.
The above equation can therefore be transformed into:

$$\rho_>(\lambda) \simeq \frac{\sigma_m^2(\lambda)}{\frac{1}{N}\sum_{i=1}^{N}\sigma_i^2(\lambda)}, \qquad (6)$$

where $\sigma_m^2(\lambda)$ is the squared market volatility conditioned on
market returns in absolute value larger than $\lambda$, and
$\sigma_i^2(\lambda) = \langle r_i^2\rangle_{>\lambda} - \langle r_i\rangle_{>\lambda}^2$.
In the context of a one-factor model, we therefore obtain:

$$\rho_>(\lambda) = \frac{\sigma_m^2(\lambda)}{\left(\frac{1}{N}\sum_{i=1}^{N}\beta_i^2\right)\sigma_m^2(\lambda) + \frac{1}{N}\sum_{i=1}^{N}\sigma_{\epsilon_i}^2}. \qquad (7)$$

The residual volatilities $\sigma_{\epsilon_i}^2$ are independent of $r_m$ and
therefore of $\lambda$, whereas $\sigma_m^2(\lambda)$ is obviously an increasing
function of $\lambda$. Hence the coefficient $\rho_>(\lambda)$ is an increasing
function of $\lambda$. The one-factor model therefore predicts an increase of
the correlations (as measured by $\rho_>(\lambda)$) in high volatility periods.
This conclusion is quite general: it holds in particular for any factor model,
even with Gaussian statistics. Therefore, the very fact of conditioning the
correlation on large market returns leads to an increase of the measured
correlation. A similar discussion in the context of Gaussian models can be
found in |J.

More precisely, we can now compare the coefficient $\rho_>(\lambda)$ measured
empirically to the one obtained within the one-factor model defined above. This
is presented in Fig. 1. Interestingly, the surrogate and empirical correlations
are similar, displaying qualitatively the same increase of the
cross-correlation when conditioned on large market returns. This shows that a
one-factor model does indeed account quantitatively for the apparent increase
of cross-correlations in high volatility periods.

Figure 1: Correlation measure $\rho_>(\lambda)$ conditional on the absolute
market return being larger than $\lambda$, both for the empirical data and the
one-factor model. Note that both show a similar apparent increase of
correlations with $\lambda$. This effect is actually overestimated by the
one-factor model with fixed residual volatilities. $\lambda$ is in percent.

The one-factor model actually even overestimates the correlations for large
$\lambda$. This overestimation can be understood qualitatively as a result of a
positive correlation between the amplitude of the market return $|r_m|$ and the
residual volatilities $\sigma_{\epsilon_i}$, which we discuss in more detail in
Section 4 below (see in particular Fig. 5). For large values of $\lambda$,
$\sigma_{\epsilon_i}$ is found to be larger than its average value. From
Eq. (7), this lowers the correlation $\rho_>(\lambda)$ as compared to the
simplest one-factor model where the volatility fluctuations of the residuals
are neglected.

Figure 2: Conditional probability that a stock has the same sign as the market
return as a function of the market return; $r_m$ is in percent. Each cross
represents the empirical probability using 4% of the days centered around a
given market return. The dotted line is the prediction of the non-Gaussian
one-factor model.
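
The one-factor prediction for the conditional correlation measure is the
conditional market variance divided by the sum of that same variance, weighted
by the mean squared regression coefficient, and the mean residual variance. The
Python sketch below evaluates this ratio on a toy market series to illustrate
how conditioning on larger absolute market returns raises the measured
correlation even though the model's correlation structure is fixed. The
function names and the toy series are hypothetical; the residual volatility of
1.66% is the rms value quoted in Section 2.

```python
def conditional_market_variance(market, lam):
    """Variance of the market return restricted to days with |r_m| > lam."""
    selected = [r for r in market if abs(r) > lam]
    m = sum(selected) / len(selected)
    return sum((r - m) ** 2 for r in selected) / len(selected)


def rho_one_factor(market, betas, sigma_eps, lam):
    """One-factor prediction for the conditional correlation measure."""
    n = len(betas)
    var_m = conditional_market_variance(market, lam)
    mean_beta2 = sum(b * b for b in betas) / n
    mean_eps2 = sum(s * s for s in sigma_eps) / n
    return var_m / (mean_beta2 * var_m + mean_eps2)


# Toy market returns in percent (hypothetical numbers).
market = [-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
betas = [1.0, 1.0, 1.0]
sigma_eps = [1.66, 1.66, 1.66]
rhos = [rho_one_factor(market, betas, sigma_eps, lam) for lam in (0.0, 1.5, 2.5)]
```

On this toy series the three values in `rhos` increase with the threshold,
purely because the conditional market variance grows while the residual
variance stays fixed.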

Conditional fraction of positive/negative returns: Another quantity of
interest is the fraction of stock returns having the same sign as the market
return, as a function of the market return itself. The empirical results are
shown in Fig. 2. We observe that for the largest returns, 90% of the stocks
have the same return sign as that of the market. Therefore, the sign of the
market appears to have a very strong influence on the sign of individual stock
returns.

This fraction can be calculated exactly within the one-factor model. Focusing
on positive market returns (the case of negative returns can be treated
similarly), a stock return $r_i$ is positive whenever
$\epsilon_i > -\beta_i r_m$. Therefore the average fraction $f(t)$ of stocks
having a positive return for a given market return $r_m(t)$ is

$$f(t) = \frac{1}{N}\sum_{i=1}^{N}\mathcal{P}_<\!\left(\frac{\beta_i r_m(t)}{\sigma_{\epsilon_i}}\right), \qquad (8)$$

where $\mathcal{P}_<$ is the cumulative normalized distribution of the residual
(chosen here to be a Student distribution with an exponent $\mu = 4$). $f(t)$
is also plotted in Fig. 2 and fits the empirical results well. The theoretical
estimate slightly overestimates the fraction $f(t)$ for positive market
returns. As explained above, the correlations between $\sigma_{\epsilon_i}$ and
$|r_m|$ do lower $f(t)$ as needed for the positive side. However, the
corresponding fraction for the negative side would then be underestimated.
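
For a unit-variance Student residual with exponent 4, whose density is
$3/(2+u^2)^{5/2}$, the cumulative distribution has a closed form, so the
predicted fraction of stocks moving with the market can be evaluated directly.
The Python sketch below is illustrative only: the function names and the
example inputs are hypothetical. The closed form follows from integrating the
density with the substitution $x = u/\sqrt{2+u^2}$.

```python
import math


def student4_cdf(u):
    """CDF of the unit-variance Student distribution with exponent mu = 4.

    This is the integral of 3 / (2 + v^2)^(5/2) for v from -infinity to u.
    """
    x = u / math.sqrt(2.0 + u * u)
    return 0.5 + 0.75 * (x - x ** 3 / 3.0)


def positive_fraction(r_m, betas, sigma_eps):
    """Expected fraction of stocks with a positive return, given r_m.

    Stock i is positive when its residual exceeds -beta_i * r_m, which by
    symmetry has probability CDF(beta_i * r_m / sigma_eps_i).
    """
    terms = [student4_cdf(b * r_m / s) for b, s in zip(betas, sigma_eps)]
    return sum(terms) / len(terms)


# Hypothetical example: a +2% market day, two stocks with beta = 1.
f_up = positive_fraction(2.0, [1.0, 1.0], [1.66, 1.66])
f_flat = positive_fraction(0.0, [1.0, 1.0], [1.66, 1.66])
```

For a negative market return, the fraction of stocks sharing the market's sign
is one minus the value returned by `positive_fraction`.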



Conditioning on large individual stock returns - quantile correlations
and exceedance correlations: Since the volatility of the residuals is nearly
two times larger than the volatility of the market, the conditioning by extreme
market events does not necessarily select extreme individual stock moves. The
quantities studied in the previous section, namely return correlations and sign
correlations, are therefore more related to the central part of the stock
distributions than to their extreme tails. We now study more specifically how
extreme stock returns are correlated between themselves. A first possibility is
to study quantile correlations, that we define as:

$$\rho(q) = \frac{\frac{1}{N^2}\sum_{i\neq j}\left(\langle r_i r_j\rangle_{q} - \langle r_i\rangle_{q}\langle r_j\rangle_{q}\right)}{\frac{1}{N}\sum_{i}\left(\langle r_i^2\rangle_{q} - \langle r_i\rangle_{q}^2\right)}, \qquad (9)$$

where the subscript $q$ indicates that we only retain in the average days such
that both $|r_i|$ and $|r_j|$ take their $q$-quantile value, within a certain
tolerance level (this tolerance is taken to be 4% of the total interval for
each quantile). In the limit $q \to 1$, this selects extreme days for both
stocks $i$ and $j$ simultaneously. The empirical results for $\rho(q)$ are
compared with the prediction of the one-factor model in Fig. 3. The agreement
is again very good, though the one-factor model still slightly overestimates
the true correlations in the extremes.

Figure 3: Correlation between stocks for joint extreme moves, $\rho(q)$, as a
function of the quantile value $q$, both for real data and the surrogate
one-factor model.

Another interesting quantity that has been much studied in the econometric
literature recently is the so-called exceedance correlation function,
introduced in P|. One first defines normalized centered returns $\tilde r_i$
with zero mean and unit variance. The positive exceedance correlation between
$i$ and $j$ is defined as:

$$\rho^+_{ij}(\theta) = \frac{\langle \tilde r_i \tilde r_j\rangle_{>\theta} - \langle \tilde r_i\rangle_{>\theta}\langle \tilde r_j\rangle_{>\theta}}{\sqrt{\left(\langle \tilde r_i^2\rangle_{>\theta} - \langle \tilde r_i\rangle_{>\theta}^2\right)\left(\langle \tilde r_j^2\rangle_{>\theta} - \langle \tilde r_j\rangle_{>\theta}^2\right)}}, \qquad (10)$$

where the subscript $>\theta$ means that both normalized returns are larger
than $\theta$. Large $\theta$'s correspond to extreme correlations. The
negative exceedance correlation $\rho^-_{ij}(-\theta)$ is defined similarly,
the conditioning being now on returns smaller than $-\theta$. Fig. 4 shows the
exceedance correlation function, averaged over the pairs $i$ and $j$, both for
real data and for the surrogate one-factor model data. As in previous papers,
we plot $\rho^+_{ij}(\theta)$ on the positive side and $\rho^-_{ij}(-\theta)$
on the negative side. As in previous studies @, 0, i, we find that
$\rho^\pm(\pm\theta)$ grows with $|\theta|$ and is larger for large negative
moves than for large positive moves. This is in strong contrast with the
prediction of a Gaussian model, which gives a symmetric tent-shaped graph that
goes to zero for large $|\theta|$. Note however that previous studies have
focused on fixed pairs of assets $i$ and $j$ (for example a few pairs of
international markets). The result of Fig. 4 is interesting since it reveals a
systematic effect over all pairs of a pool of 450 stocks.

Several models have been considered to explain the observed results [7].
Simple GARCH or jump models cannot account for the shape of the exceedance
correlations. Qualitatively similar graphs can however be reproduced within a
rather sophisticated 'regime switching' model, where the two assets switch
between a positive, low volatility trend with small cross-correlations and a
negative, high volatility trend with large cross-correlations. Note that by
construction, this 'regime switching' model induces a strong skew in the
'index' (i.e. the average between the two assets). Fig. 4 however clearly shows
that a fixed-correlation non-Gaussian one-factor model is enough to explain
quantitatively the level and asymmetry of the exceedance correlation function.
In particular, the asymmetry is induced by the large negative skewness in the
distribution of index returns, and the growth of the exceedance correlation
with $|\theta|$ is related to distribution tails fatter than exponential (in
our case, these tails are indeed power laws).
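
For a single pair of stocks, the exceedance correlation is the ordinary
correlation coefficient computed only on days where both standardized returns
lie beyond the threshold. The Python sketch below illustrates this under
conventions chosen here, not taken from the paper: a non-negative threshold
selects joint upward exceedances, a negative threshold selects joint downward
exceedances, and `None` is returned when too few days qualify. The function
names and the toy series are hypothetical.

```python
import math


def standardize(xs):
    """Center a series and scale it to unit variance."""
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))
    return [(x - m) / sd for x in xs]


def exceedance_correlation(r_i, r_j, theta):
    """Correlation of standardized returns on joint exceedance days.

    theta >= 0 keeps days where both returns exceed theta; theta < 0 keeps
    days where both returns fall below theta.
    """
    a, b = standardize(r_i), standardize(r_j)
    if theta >= 0:
        pairs = [(x, y) for x, y in zip(a, b) if x > theta and y > theta]
    else:
        pairs = [(x, y) for x, y in zip(a, b) if x < theta and y < theta]
    n = len(pairs)
    if n < 2:
        return None  # not enough joint exceedances
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    if vx == 0.0 or vy == 0.0:
        return None  # correlation undefined for a constant series
    return cov / math.sqrt(vx * vy)


# Toy series (hypothetical numbers).
r_a = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
r_b = [-2.0, -1.0, 0.0, 2.0, 1.0, 3.0]
rho_up = exceedance_correlation(r_a, r_b, 0.0)
```

Averaging this quantity over all pairs, for a grid of thresholds, gives the
kind of curve discussed above.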

4 Conditional statistics of the residuals 

We conclude from the above results that the observed fluctuations of the stock
cross-correlations are mainly a consequence of the volatility fluctuations and
skewness of the market return, and that a non-Gaussian one-factor model does
reproduce satisfactorily most of the observed effects. However, some small
systematic discrepancies appear, and call for an extension of the one-factor
model. The most obvious effect not captured by a one-factor model is the
recently discovered 'ensemble' skewness in the daily distribution of stock
returns, as discussed by Lillo and Mantegna [16]. More precisely, they have
shown that the histogram of all the stock returns for a given day displays on
average a positive skewness when the market return is positive, and a negative
skewness when the market return is negative. The amplitude of this skewness
furthermore grows with the absolute value of the market return. Note that this
skewness is not related to the possible non-zero skewness of individual stocks
that has been recently discussed in several papers in relation with extended
CAPM models [12], [17].

Figure 4: Average exceedance correlation functions between stocks as a function
of the level parameter $\theta$, both for real data and the surrogate
one-factor model. We have shown $\rho^+_{ij}(\theta)$ on the positive side and
$\rho^-_{ij}(-\theta)$ on the negative side. Note that this quantity grows with
$|\theta|$ and is strongly asymmetric.

Clearly, this 'ensemble' skewness that depends on the market return cannot be
explained by the above one-factor model where the residuals have a
time-independent zero skewness. The one-factor model is certainly an
oversimplification of reality: although the market captures the largest part of
the correlation between stocks, industrial sectors are also important, as can
be seen from a diagonalization of the correlation matrix [18]. Large moves of
the market can be dominated by extreme moves of a single sector, while the
other sectors are relatively unaffected. This effect does induce some skewness
in the fixed-day histogram of stock returns.

Figure 5: Daily residual 'volatility' $\Sigma$ (Eq. (11)) averaged over the
different stocks for a given day as a function of the market return for the
same day. $\Sigma$ is in percent.

A way to account for this effect is to allow the distribution of the residual
$\epsilon_i(t)$ to depend on the market return $r_m(t)$. In order to test this
idea, we have studied directly some moments of the distribution of the
residuals for a given day as a function of the market return that particular
day. We have studied the following quantities:

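$$\Sigma = \left[\,\left|\epsilon_i - [\epsilon_i]\right|\,\right], \qquad (11)$$

$$S = \frac{[\epsilon_i] - \mathrm{Med}(\epsilon_i)}{\Sigma}, \qquad (12)$$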

K = [£ - ] ~ 2 h]2 , (13) 

where the square brackets [...] mean that we average over the different stocks
for a given day and Med selects the median value of the $\epsilon_i$. These
three quantities should be thought of as robust alternatives to the standard
variance, skewness and kurtosis, which are based on higher moments of the
distribution.
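
A low-moment 'volatility' and 'skewness' of one day's residuals can be computed
as sketched below. The definitions used here are assumptions made for
illustration: the volatility is taken as the cross-sectional mean absolute
deviation from the mean, and the skewness as the mean minus the median, divided
by that volatility. The function name and the toy residuals are hypothetical.

```python
import statistics


def residual_shape(residuals):
    """Return (volatility, skewness) of one day's cross-section of residuals.

    Assumed definitions: volatility = mean absolute deviation from the mean,
    skewness = (mean - median) / volatility.
    """
    mean = statistics.fmean(residuals)
    median = statistics.median(residuals)
    volatility = statistics.fmean([abs(e - mean) for e in residuals])
    if volatility == 0.0:
        raise ValueError("all residuals are identical; skewness is undefined")
    return volatility, (mean - median) / volatility


# Toy cross-sections for a single day (hypothetical numbers, in percent).
skewed_day = [-1.0, -1.0, 0.0, 0.0, 7.0]
symmetric_day = [-2.0, -1.0, 0.0, 1.0, 2.0]
vol_skewed, skew_skewed = residual_shape(skewed_day)
vol_sym, skew_sym = residual_shape(symmetric_day)
```

A few large positive residuals pull the mean above the median, giving a
positive skewness, while a symmetric cross-section gives zero. Repeating the
computation for every day and plotting the results against that day's market
return yields scatter plots of the kind discussed in this section.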

The quantity $\Sigma$ measures the 'volatility' of the residuals and is shown
in Fig. 5 as a function of $|r_m(t)|$. A linear regression is also shown for
comparison. It is clear that there is a positive correlation between the market
volatility and the volatility of the residuals, not captured by the simplest
one-factor model. As explained above, this effect actually allows one to
account quantitatively for the systematic overestimation of the observed
correlations.

In order to confirm the skewness effect of Lillo and Mantegna, we have then
studied the quantity $S$. This quantity is positive if the distribution is
positively skewed. Fig. 6 shows a scatter plot of $S$ as a function of $r_m$
[19]. Again these two quantities are positively correlated, as emphasized by
Lillo and Mantegna (although their analysis is different from ours).

Figure 6: Daily residual 'skewness' $S$ averaged over the different stocks for
a given day as a function of the market return for the same day. Note that the
skewness is computed using low moments of the distribution to reduce the
measurement noise and does not correspond to the usual definition (see
Eq. (12)).

Therefore, both the volatility and the skew of the residuals are quite strongly
correlated with the market return. One could wonder if higher moments of the
distribution are also sensitive to the value of $r_m$. We have therefore
studied the quantity $\mathcal{K}$ as one possible refined measure of the shape
of the distribution of residuals. This is shown in Fig. 7 and reveals a much
weaker dependence than the previous two quantities.



5 Conclusion 

We have thus shown that the apparent increase of correlation between stock
returns in extreme conditions can be satisfactorily explained within a static
one-factor model which accounts for fat-tail effects. In this model,
conditioning on a high observed volatility naturally leads to an increase of
the apparent correlations. The much discussed exceedance correlations can also
be reproduced quantitatively and reflect both the non-Gaussian nature of the
fluctuations and the negative skewness of the index, and not the fact that
correlations themselves are time dependent.

This one-factor model is however only an approximation to the true
correlations, and more subtle effects (such as the Lillo-Mantegna 'ensemble'
skewness) require an extension of the one-factor model, where the variance and
skewness of the residuals themselves depend on the market return.

Figure 7: Daily residual 'kurtosis' $\mathcal{K}$ averaged over the different
stocks for a given day as a function of the market return for the same day.
Again, the kurtosis is computed using low moments of the distribution to reduce
the measurement noise and does not correspond to the usual definition (see
Eq. (13)).